Communications Biology

Springer Science and Business Media LLC

Preprints posted in the last 7 days, ranked by how well they match Communications Biology's content profile, based on 886 papers previously published here. The average preprint has a 0.59% match score for this journal, so anything above that is already an above-average fit.

1
Modeling cycle phases using hormone trajectories in women with and without polyendocrine metabolic ovarian syndrome

Stujenske, T. M.; Bouchard, T. P.; Troy, A.; Kelemen, S.; Folino, B.; Wills, T.; Sugden, L. A.

2026-06-04 obstetrics and gynecology 10.64898/2026.06.02.26354701 medRxiv
Top 3%
2.8%

The recent availability of at-home menstrual cycle tracking technology has created opportunities for personalized assessment of reproductive health, alongside improved characterization of hormone patterns in women with and without reproductive disorders such as polyendocrine metabolic ovarian syndrome (PMOS), which affects approximately 10% of reproductive-age women. In this study, we leverage self-tracked urinary hormone data to develop an autoregressive Hidden Markov model (arHMM) that maps cycle days to physiologically meaningful phases based on hormone trajectories. By modeling day-to-day hormonal dynamics rather than absolute hormone levels, and allowing variable phase durations, this approach accommodates substantial variability in menstrual cycles, thereby enabling meaningful comparisons within and between individuals. Across more than 3800 cycles from over 1100 individuals, we find that arHMM-derived phases reproduce expected hormonal patterns within follicular, periovulatory, and luteal phases, and that phase-based timing for hormone testing outperforms conventional cycle day-based testing in capturing the luteinizing hormone surge and post-ovulatory progesterone rise, highlighting limitations of fixed-day clinical protocols. We identify phase-specific differences between healthy controls and individuals with self-reported PMOS, including lower luteinizing hormone in the periovulatory phase, and reduced luteal-phase progesterone levels in PMOS. Furthermore, features derived from arHMM phase assignments enable classification of PMOS status with ~78% accuracy, demonstrating the potential of this approach for non-invasive PMOS screening.

2
Topological Deep Learning Identifies Polygenic Variant Clusters Across Familial Multimorbid Disorders

Vomo-Donfack, K. L.; Bousquet, G.; Falgarone, G.; Ginot, G.; Morilla, I.

2026-06-09 health informatics 10.64898/2026.06.03.26354242 medRxiv
Top 4%
2.4%

Whole-genome sequencing comprehensively captures coding, non-coding and structural variation in families with suspected inherited disorders, yet its clinical utility remains constrained by an interpretation bottleneck: selecting a handful of relevant variants from millions of candidates. Current rule-based pipelines, anchored in ACMG/AMP criteria, excel at identifying highly penetrant Mendelian alleles but frequently miss variants of low-to-moderate penetrance, non-coding alterations and germline-somatic interactions. Here we introduce PolyCLIP-T, a topology-guided multimodal framework that transforms variant selection from a classification problem into a geometric discovery task. By contrastively aligning DNA-sequence embeddings with functional annotations, PolyCLIP-T constructs a unified latent space in which the displacement between reference and alternate embeddings quantifies the molecular perturbation induced by each variant. Persistent homology then identifies stable topological components - coherent variant groups shared among affected relatives - that transcend single-variant scoring logic. Applied to six families with multi-morbid cancer, autoimmune and cardiovascular disease, PolyCLIP-T recovered non-coding and structural candidates overlooked by conventional pipelines and revealed pleiotropic networks spanning disease categories. This approach provides an interpretable, scalable solution for genome-first investigations of disorders driven by polygenic architectures that evade single-variant analysis. The framework was developed and benchmarked on deeply characterised familial cohorts selected for transgenerational multimorbidity; validation in larger, independent populations will be essential to establish its generalisability. An interactive web tool is freely available at https://www.polyclip-t.uma.es/.

3
BodyMAE: A Surface-Area Aware Masked Autoencoder for Body Composition Estimation from 3D Body Scans

Zheng, Y.; Feng, B.; Cheng, R.; Qiu, C.; Long, Z.; Vaziri, K.; Hahn, J.

2026-06-06 health informatics 10.64898/2026.06.04.26354925 medRxiv
Top 5%
2.0%

Accurate assessment of body composition is important for risk stratification and management of metabolic, musculoskeletal, and aging-related diseases, yet reference modalities such as dual-energy X-ray absorptiometry (DXA) are costly and impractical for frequent monitoring. Commodity 3D body scans offer a low-cost, radiation-free alternative, but extracting meaningful and predictive shape features from scans remains challenging due to nonuniform point density, variable body size, and cross-device differences. We introduce BodyMAE, a self-supervised, surface-area aware masked autoencoder for metric-scale 3D body scans. The pipeline integrates area-adjusted sampling, a long-range focused encoder, and a lightweight decoder regularized to promote locally uniform reconstructions. Trained and evaluated on 917 3D body scans paired with clinical DXA reports, BodyMAE achieves strong accuracy on fat percentage (root-mean-square error (RMSE) 3.825 percentage points, R^2 0.908), fat mass (RMSE 3.694 kg, R^2 0.968), and lean mass (RMSE 3.608 kg, R^2 0.901), with competitive performance on bone mineral content (RMSE 0.284 kg, R^2 0.754). We also assess feature stability across pretrained baselines, finding higher retrieval accuracy for our representations (Top-1 90.131%). These results indicate that combining metric-aware sampling, long-range relational encoding, and local geometric regularization enables accurate body composition estimation from 3D body scans, as validated by comparisons to DXA-derived measurements.
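
For readers less familiar with the two metrics quoted in this abstract, the following is a minimal sketch of how RMSE and R^2 are conventionally computed from paired reference and predicted values. The function names and the numbers are hypothetical illustrations and are not taken from the BodyMAE preprint.

```python
import math

def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical fat-percentage values: DXA reference vs. model prediction.
dxa = [20.0, 25.0, 30.0, 35.0]
pred = [22.0, 24.0, 29.0, 37.0]

print(rmse(dxa, pred), r_squared(dxa, pred))
```

RMSE is expressed in the units of the target (percentage points or kg above), whereas R^2 is unitless, which is why the abstract reports both.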

4
Towards the Virtual Amyotrophic Lateral Sclerosis Patient: Inferring Cortical Excitability through Whole-Brain Dynamical Modeling

Angiolelli, M.; Demuru, M.; Lopez, E. T.; Hashemi, M.; Ziaeemeh, A.; Rabuffo, G.; Trojsi, F.; Granata, C.; Tafuri, D.; De Luca, M.; Gallo, E.; Jirsa, V.; Depannemaecker, D.; Sorrentino, P.

2026-06-10 neurology 10.64898/2026.06.09.26354829 medRxiv
Top 6%
2.0%

Amyotrophic lateral sclerosis (ALS) is increasingly recognized as a multisystem neurodegenerative disorder in which motor-neuron degeneration is accompanied by widespread alterations in cortical dynamics. Among its most reproducible neurophysiological signatures is cortical hyperexcitability, yet how this local excitability imbalance shapes distributed whole-brain activity remains poorly understood. Here, we combined source-reconstructed resting-state MEG data, tractography-informed whole-brain modeling, and simulation-based inference to investigate whether ALS-related alterations in large-scale brain dynamics can be mechanistically explained by changes in cortical excitability. First, we characterized empirical brain dynamics using complementary features spanning regional activity amplitude and variability, functional connectivity, and avalanche-based metrics. These analyses revealed significant alterations in ALS patients relative to healthy controls, as well as associations with clinical impairment and disease staging. To mechanistically interpret these changes, we employed a reduced Wong-Wang whole-brain model in which local recurrent excitation modulates emergent large-scale neural dynamics. Simulations showed that increasing excitability systematically reproduced the empirical dynamical signatures observed in ALS. We then applied a simulation-based inference framework to estimate latent excitability parameters directly from empirical observations. Whole-brain model inversion revealed increased excitability in ALS patients compared with controls. The recovered excitability parameter was associated with disease staging, supporting its clinical relevance as a model-derived descriptor of ALS progression. Finally, by extending the model to estimate frontal and non-frontal excitability separately, we found that ALS-related alterations were predominantly associated with increased frontal excitability, whereas non-frontal regions appeared comparatively less affected. The recovered parameters related to disease staging. Together, these findings provide a mechanistic framework linking altered large-scale brain dynamics in ALS to selective cortical hyperexcitability, explaining how local excitability changes can give rise to global network reorganization. More broadly, they show how computational model inversion can recover latent multiscale pathophysiological processes from empirical neural recordings, offering a non-perturbative alternative to complex experimental paradigms typically required to causally probe local-to-global mechanisms.

5
Human genetic evidence links serine biosynthesis to diabetic peripheral neuropathy

Fridman, V.; Kakar, A.; Jensen, A.; Van de Vondel, L.; Wheeler, A.; Phillips, L. S.; Zhou, J.; Zuchner, S.; Reusch, J.; Raghavan, S.

2026-06-10 genetic and genomic medicine 10.64898/2026.06.09.26355286 medRxiv
Top 9%
1.7%

Diabetic peripheral neuropathy (DPN) is a common and disabling condition for which no disease-modifying therapies are available. Glycemic and metabolic drivers do not fully explain why only a subset of individuals with diabetes develop DPN, and genetic contributors remain poorly defined. We aimed to perform a multi-population genome-wide association study (GWAS) of DPN to highlight potential new etiological pathways and therapeutic targets. Methods: We performed a multi-population GWAS of neuropathy in people with and without diabetes using the VA Million Veteran Program and UK Biobank, followed by replication in the All of Us Research Program (AoU), and gene-based and gene-set analyses to identify implicated pathways. Causal relationships between circulating serine levels and DPN were further tested using two-sample Mendelian randomization. To further evaluate pathogenic potential, we analyzed rare, high-impact variants in GWAS-implicated genes among individuals with unresolved inherited neuropathies using the GENESIS platform. Findings: Among individuals with type 2 diabetes, we identified seven genome-wide significant loci (p<5x10^-8): PHGDH and PSPH (key serine synthesis genes), TEAD1, CYP4F11, LARGE1, FTO, and COBLL1. No loci were significant in individuals without diabetes or with type 1 diabetes. Four loci (PHGDH, TEAD1, FTO and CYP4F11) replicated in AoU (p<0.05). Mendelian randomization demonstrated that higher genetically predicted serine levels were associated with lower DPN risk, consistent with a causal role of serine metabolism in disease pathogenesis. Rare-variant burden analyses revealed associations of predicted deleterious variants with inherited neuropathy case status in PHGDH (odds ratio [OR] 12.7 [95% CI 7.9, 20.4]), PSPH (OR 8.5 [7.2, 10.2]), PHKG1 (OR 4.8 [3.7, 6.3]), and LARGE1 (OR 0.007 [0.0004, 0.1]). Interpretation: Convergent genetic evidence across common and rare variation implicates serine synthesis as a key pathway in DPN. These findings link diabetic and inherited neuropathies through a shared metabolic mechanism, identifying serine metabolism as a potential therapeutic target.

6
Aperiodic and oscillatory activity of the human brain during induced emotional states

Park, H.; Hacker, C.; Cho, H.; Xie, T.; Simmons, A.; Tan, G.; Leuthardt, E. C.; Brunner, P.; Willie, J.

2026-06-09 neurology 10.64898/2026.06.02.26354146 medRxiv
Top 9%
1.7%

Normal emotional experience depends on dynamic modulation of neural excitability across limbic and prefrontal circuits, yet the spectral markers that reflect these shifts in humans remain incompletely understood. In this study, we combined a validated video-based emotion induction paradigm with stereotactic electroencephalography (SEEG) in 31 patients with drug-resistant epilepsy to investigate how positive and negative affective states modulate oscillatory and aperiodic (asynchronous) neural activity. Using spectral parameterization to dissociate oscillatory power from the aperiodic 1/f component, we found that emotional valence robustly altered the aperiodic slope in a regionally specific manner: negative valence flattened the slope in thalamus, posterior insula, and posterior cingulate cortex, whereas positive valence produced flattening in dorsolateral prefrontal cortex. Simultaneous oscillatory changes included increased high-frequency activity and decreased alpha/beta power during negative affect, and reduced alpha power during positive affect, which were elucidated after adjusting for broadband aperiodic spectral shifts. These effects persisted after controlling for audiovisual stimulus or physiological features and were not evident in simultaneously recorded scalp EEG, underscoring their localization to intracranial sites. Together, these results provide the first direct evidence that active induction of emotional states modulates the aperiodic slope of human intracranial field potentials, reflecting valence-dependent shifts in local circuit excitability. The findings highlight the 1/f slope as a sensitive neural marker of affective brain states and for mood dysregulation.
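
To make the notion of an aperiodic slope concrete, here is a minimal sketch that estimates a 1/f exponent with a straight-line fit in log-log space. This is a deliberate simplification and not the spectral parameterization procedure used in the preprint: it assumes the spectrum contains no oscillatory peaks, and the function name, frequency range, and synthetic spectra are hypothetical.

```python
import numpy as np

def aperiodic_exponent(freqs, power):
    """Estimate the 1/f exponent from a straight-line fit in log-log space.

    Simplifying assumption: no oscillatory peaks are present, so
    log10(power) = offset - exponent * log10(freq).
    """
    slope, offset = np.polyfit(np.log10(freqs), np.log10(power), deg=1)
    return -slope, offset

# Hypothetical fitting range (2-40 Hz) and synthetic, noise-free spectra.
freqs = np.arange(2.0, 41.0)
steep_spectrum = 10.0 / freqs ** 2.0
flat_spectrum = 10.0 / freqs ** 1.5

steep_exp, steep_offset = aperiodic_exponent(freqs, steep_spectrum)
flat_exp, flat_offset = aperiodic_exponent(freqs, flat_spectrum)
print(steep_exp, flat_exp)
```

In this convention a "flattened" slope corresponds to a smaller exponent, as in the second synthetic spectrum. Real recordings would need the oscillatory peaks modeled or excluded before such a fit is meaningful.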

7
STELLAR: A flexible ensemble learning framework integrating rare variants to enhance polygenic risk prediction

Chen, T.; Li, X.; Mazumder, R.; Zhang, H.; Lin, X.

2026-06-09 genetic and genomic medicine 10.64898/2026.06.07.26355109 medRxiv
Top 11%
1.4%

Whole-exome and whole-genome sequencing technology has enabled the discovery of rare genetic variants associated with human health and diseases. However, existing statistical methods used for rare variant association testing are not well-suited for building genetic risk prediction models that jointly incorporate rare and common variants. We propose STELLAR, a flexible ensemble learning-based approach to compute rare variant polygenic risk scores (PRS) using association summary statistics to enhance conventional common variant PRS. Our method combines burden-based and penalty-based rare variant analysis and leverages functional annotation information to prioritize potentially causal variants within the prediction models. In simulation studies, PRS using STELLAR consistently showed the highest prediction accuracy compared to models using common variants alone or rare variant burdens. Applied to UK Biobank whole-exome sequencing data (n=310,831) across eight continuous and five binary traits, STELLAR significantly improved prediction accuracy, refined stratification of individuals at the highest genetic risk beyond common variants, and prioritized biologically relevant genes. STELLAR provides a scalable strategy to incorporate rare variants into PRS in addition to common variants, advancing precision risk prediction and enabling more comprehensive assessment of genetic contributions to complex diseases.

8
Stochastic Morphodynamics of the Human Aorta Across the Lifespan

Twohig, K. C.; Mansour, M.; Pugar, J. A.; Yuan, K.; Pocivavsek, L.; Klishin, A. A.

2026-06-08 surgery 10.64898/2026.06.05.26355015 medRxiv
Top 12%
1.3%

Biological systems evolve as continuous dynamical processes, but at organ-scale and across human lifespans they are rarely observed longitudinally; population data typically exist instead as sparse, cross-sectional snapshots. Inferring lifespan dynamics from such data requires methods distinct from those used at cellular and tissue scales where dense observations are accessible. We address this problem in the thoracic aorta, where surgical decisions currently rest on static, age- and sex-agnostic diameter thresholds that reduce three-dimensional morphology to a single scalar. Treating normal aortic morphology as a stochastic dynamical system, we pose a continuous-time drift-diffusion process in a two-coordinate state space of normalized surface area (A) and normalized fluctuation in integrated Gaussian curvature (δK), and fit closed-form solutions of the Fokker-Planck equation by maximum likelihood to a sex-balanced, age-uniform cohort spanning infancy to age 99. Inter-individual variability is treated as a fitted diffusion parameter rather than as residual scatter, which is distinct from prior normative studies that report variability as scatter around a regression line. The framework identifies two growth regimes for aortic size (childhood expansion followed by persistent adult growth, with adult males growing approximately 70% faster than adult females) and a single dynamical regime for aortic shape, with heteroscedastic variability accumulating at a rate comparable to the mean drift over the lifespan. Applied to independent cohorts of acute and chronic thoracic aortic dissections, the multivariate model identifies over 95% as statistical outliers via Mahalanobis distance, consistently outperforming either coordinate alone. The same probabilistic envelope that describes normal aging thus defines a baseline against which disease can be detected, supporting a shift toward dynamic, age- and sex-aware assessment of thoracic aortic pathology.
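
The outlier test mentioned in this abstract can be illustrated with a minimal sketch of Mahalanobis-distance flagging in a two-coordinate state space. In the preprint the normative mean and covariance depend on age and sex through the fitted drift-diffusion model; here they are fixed, hypothetical numbers, and the function names and the 95% coverage level are assumptions for illustration only.

```python
import numpy as np

def mahalanobis_sq(points, mean, cov):
    """Squared Mahalanobis distance of each row of `points` from `mean`."""
    diff = np.atleast_2d(points) - mean
    inv_cov = np.linalg.inv(cov)
    return np.einsum("ij,jk,ik->i", diff, inv_cov, diff)

def is_outlier(points, mean, cov, coverage=0.95):
    """Flag points outside the `coverage` probability ellipse of a 2D Gaussian.

    For two dimensions the chi-square quantile has the closed form
    -2 * ln(1 - coverage), so no statistics library is needed.
    """
    threshold = -2.0 * np.log(1.0 - coverage)
    return mahalanobis_sq(points, mean, cov) > threshold

# Hypothetical normative envelope in (size, shape) coordinates.
mean = np.array([1.0, 0.0])
cov = np.array([[0.04, 0.01],
                [0.01, 0.09]])
scans = np.array([[1.05, 0.10],   # close to the normative mean
                  [1.80, 1.20]])  # far from it in both coordinates

d2 = mahalanobis_sq(scans, mean, cov)
flags = is_outlier(scans, mean, cov)
print(d2, flags)
```

Because the distance accounts for the covariance between the two coordinates, a point can be flagged jointly even when neither coordinate alone is extreme, which is consistent with the abstract's remark that the multivariate model outperforms either coordinate alone.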

9
A Hierarchical Visual EEG Framework for the Assessment of Disorders of Consciousness

Chen, Y.; Ge, Q.; Li, H.; Kang, X.; Chen, Q.; He, W.; Sun, Y.; Zhang, S.; Laureys, S.; Chen, X.; He, J.; Gao, X.

2026-06-05 neurology 10.64898/2026.06.04.26354678 medRxiv
Top 13%
1.3%

The objective assessment of patients with disorders of consciousness (DOC) remains a significant clinical challenge. Behavioral scales like the Coma Recovery Scale-Revised (CRS-R) are susceptible to rater subjectivity and have difficulty in detecting patients with cognitive-motor dissociation (CMD), while existing electrophysiological paradigms typically evaluate isolated processing levels, especially in visual functions. To address these limitations, we developed a novel, hierarchical visual EEG framework that evaluates three progressive tiers of visual processing (sensory input, selective attention, and object discrimination) within a single, unified paradigm. This framework uses steady-state and event-related potentials, analyzed with statistical testing and machine learning, to provide objective detection. In a cohort of 85 participants, the framework demonstrated a robust alignment with behavioral CRS-R levels and successfully identified CMD patients missed by bedside behavioral examinations. Notably, model predictions derived from this framework showed a significant correlation with 3-month clinical outcomes. This prognostic utility generalized effectively and remained consistent across distinct EEG acquisition systems in an independent validation cohort of 17 patients. In summary, this work offers electrophysiological validation for the hierarchical design of the CRS-R and provides a practical tool for bedside objective assessment of DOC.

10
Watching the FIFA World Cup and Adult Sleep Quality: A Cross-Sectional Online Survey

Aljamaan, F.; Alanteet, A. A.; Chaiah, Y.; Dasuqi, S. A.; Alarabi, M. A.; Saeed, E.; Al-khatib, S. M.; Darweesh, A. A.; Raina, M.; Saad, K.; Alhasan, K.; BaHammam, A. S.; Temsah, M.-H.

2026-06-08 sports medicine 10.64898/2026.06.07.26355072 medRxiv
Top 15%
1.2%

Major international sporting events frequently impose exogenous demands that challenge adult circadian rhythms, often leading to the misalignment of sleep-wake cycles and social schedules. This cross-sectional study investigated the impact of the 2022 FIFA World Cup on adult sleep patterns to assess the prevalence and determinants of tournament-associated circadian disruption. Through an online survey, we captured data on sleep duration, timing, and subjective quality from a diverse adult population using the Pittsburgh Sleep Quality Index (PSQI) score. The results indicate that 81.3% of participants had highly problematic sleep according to PSQI scores, while only 9% perceived that their sleep pattern was impacted by watching matches during the tournament. While 83.7% of the participants had low or mild anxiety according to GAD-7 scores, we found that GAD-7 scores correlated significantly with PSQI scores. Married participants had significantly lower PSQI scores (RR 0.856, p = .005), while those who reported that their sleep hours had changed during the tournament had significantly higher PSQI scores (1.180, P-value <0.001). Males reported a significantly higher impact of the tournament on their sleep (OR 2.622, P-value <0.001). In conclusion, our data demonstrate a discrepancy between self-perception of sleep quality and self-rated assessment by PSQI scores, as well as the substantial impact of major international sporting events on adult sleep hygiene. The results provide data-driven insights helpful in evaluating potential circadian risks and informing public health strategies for major sporting events such as the FIFA World Cup.

11
Heart Rate Circadian Oscillations as Digital Biomarkers of Cardiometabolic Health Determinants

Colitta, A.; Bruno, S.; Benedetti, D.; Hoxhaj, D.; Cruz-Sanabria, F.; Di Pede, C.; Buracchi Torresi, F.; Frumento, P.; Gargani, L.; Fabbrini, M.; Maestri Tassoni, M.; Bonanni, E.; Faraguna, U.

2026-06-10 cardiovascular medicine 10.64898/2026.06.07.26355124 medRxiv
Top 18%
0.9%

AIMS Cardiometabolic risk factors may impair health by altering the autonomic modulation of the cardiovascular system, a physiological process described by heart rate (HR) circadian oscillations. However, the impact of cardiometabolic health determinants on HR circadian oscillations remains scarcely characterized in real-world, population-based settings. To address this, we applied digital health technologies to investigate how cardiometabolic health determinants shape HR circadian oscillations in a real-world cohort of individuals free of cardiometabolic diseases. METHODS First, a 10-fold cross-validation of a model was performed, aiming at mitigating wearable measurement error caused by motion artifacts. This process was informed by 10,056 epochs of concurrent wearable-derived and polysomnographic HR assessment, yielding an average 1.3 bpm reduction in wearable measurement error. We subsequently applied this model to over 2 million 1-minute epochs of HR data, derived from 7-day continuous actigraphic recordings of 245 individuals free of cardiometabolic disorders. Functional-on-scalar regression modelling and both parametric and nonparametric analyses characterized HR circadian profiles and their relationships with demographics, lifestyle, chronotype, sleep health, and chronic insomnia diagnosis. A 6-dimension sleep health index was calculated. RESULTS Sex, chronotype, and sleep health predominantly shaped HR circadian oscillations. In detail, females consistently showed higher HR across the 24 hours. Moreover, chronotype was associated with a phase shift in HR circadian profiles, with later timings corresponding to eveningness. Notably, sleep health impacted HR circadian oscillations in a dose-dependent fashion: each additional impaired sleep dimension was associated with a 1.2 bpm HR increase during nighttime, alongside reduced circadian robustness and delayed oscillation timings. Finally, the earlier occurrence of morning HR peaks served as a digital biomarker of insomnia (80% specificity, 74% sensitivity). CONCLUSIONS This work provides a digital health framework to characterize HR circadian oscillations in free-living populations and supports its clinical utility in capturing the autonomic disruptions related to cardiometabolic health determinants.

12
TACR3 variant confers resilience to aging and Alzheimer's disease

Ruffini, N.; Fischer, F. U.; Subirana Slotos, R.; Goschke, J.; Scholz, L.; Knaepen, K.; Huettelmaier, S.; Morrison, H.; Steffan, T.; Pabst, A.-S.; Winter, J.; Baier, B.; Mierau, A.; Binder, H.; Drzezga, A.; Teipel, S.; Fellgiebel, A.; Endres, K.; Tuescher, O.

2026-06-08 genetic and genomic medicine 10.64898/2026.06.06.26355071 medRxiv
Top 19%
0.9%

Background: While genetic factors strongly influence brain aging trajectories, variants conferring cognitive resilience remain poorly characterized. The neurokinin-3 receptor (NK3-R), encoded by Tachykinin Receptor 3 (TACR3), modulates cholinergic signaling in memory circuits vulnerable to aging. Previous studies linked the non-WT expression of the TACR3 variant rs2765 with cognitive decline and reduced volume of the hippocampus and basal forebrain, but systematic replication and mechanistic validation were lacking. Methods: We investigated rs2765 in the preregistered AgeGain cohort of cognitively healthy older adults (n=188) with independent validation in the ADNI cohort (n=809), which includes persons with and without Alzheimer's disease (AD) who show healthy cognition, mild cognitive impairment or dementia. Analyses integrated structural neuroimaging, longitudinal cognitive assessments, epigenetic aging (PhenoAge), genome-wide methylation profiling, and mechanistic validation through luciferase assays and cross-species protein expression studies. Results: The infrequent protective rs2765 WT variant, found in 12.8% of Europeans, conferred 49% slower cognitive decline (p = 0.002) for amyloid-positive individuals of the ADNI cohort and 3.7 years younger epigenetic age (p = 0.013, 95% CI: 0.79-6.67 years) in the cognitively healthy AgeGain cohort. WT carriers showed larger hippocampal and basal forebrain volumes across cohorts, with Allen Brain Atlas integration revealing these outcomes to occur exclusively in regions where TACR3 expression positively correlated with gray matter volume. Mechanistically, the non-WT variant ameliorated RBMX-mediated post-transcriptional regulation, reducing NK3-R protein expression by 25-40% in vitro and ex vivo murine brain slice models. Senescence-accelerated mice exhibited reduced endogenous NK3-R expression, phenocopying the predicted functional consequences of the variant. In AgeGain participants, genome-wide methylation profiling identified 2,313 differentially methylated CpGs affecting 228 pathways spanning glutamatergic signaling, acetylcholine receptor pathways, chromatin remodeling, and angiogenesis, suggesting coordinated molecular reprogramming from synaptic function to systemic aging. Conclusions: rs2765 WT confers resilience to age- and AD-related cognitive decline through RBMX-dependent regulation of NK3-R expression, with effects of remarkable size cascading from memory to systemic aging. rs2765 genotyping could stratify individuals for NK3-R modulator therapy (e.g., fezolinetant or senktides) and identify those maintaining function despite pathological burden, complementing APOE-based risk assessment in precision geromedicine.

13
Contextualizing the Utility of Polygenic Risk Scores using Absolute Risk Models in Diverse Ancestry Populations

Chatterjee, N.; Martina, F.; Kachuri, L.; Natarajan, P.; Witte, J.; Huo, D.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.03.26354842 medRxiv
Top 19%
0.9%

Polygenic risk scores (PRSs) are emerging as powerful tools for quantifying inherited risk for common diseases and, in some cases, are approaching clinical implementation. A major concern for PRS implementation is their limited accuracy in non-European populations, particularly in those of African ancestry. However, past evaluations have focused on metrics such as relative risk or AUC, which do not capture background risk arising from contextual factors. We introduce a novel measure of variable importance, the conditional average derivative estimator (CADE), to evaluate PRS utility across diverse contexts and populations within absolute risk models that integrate PRSs with other relevant risk factors. We illustrate this framework by integrating PRSs for breast and prostate cancer within age-specific absolute risk models for incidence and mortality fit using individual-level data from the All of Us Research Program with inputs from the National Cancer Institute SEER cancer registry. Our projections show that although the PRSs are known to have the lowest discriminatory accuracy in African Americans (AA), there are contexts in which they provide greater utility, such as for the stratification of prostate cancer risk and mortality, where the CADE values for AA were 2- and 7-fold higher than for European Americans. These findings suggest that conclusions about the limited clinical utility of PRS in non-European populations may be premature and underscore the need to quantify PRS risk-stratification utility at the absolute-risk level, while accounting for disease onset, survival, and broader health and economic factors.

14
STDP-inspired temporal transition modeling for adaptive clinical risk prediction from electronic health records

Gong, L.; Aswani, N.; Shahinian, P.; Yang, J. Y.; Kontos, D.; Manji, G.; Kang, S.; Hur, C.

2026-06-09 health policy 10.64898/2026.06.04.26354919 medRxiv
Top 21%
0.8%

Electronic health record (EHR) prediction models often summarize longitudinal histories as static patient-level features, which may omit potentially informative event ordering. We developed a simplified spike-timing-dependent plasticity (STDP)-inspired framework that represents asynchronous EHR data as sparse, directional transition features. The approach encodes whether one clinical event precedes another within prespecified temporal windows, preserving event identity, directionality, and approximate timing while retaining feature-level interpretability. We evaluated this framework in two retrospective prediction tasks with different temporal scales: incident acute kidney injury (AKI) prediction in 17,351 MIMIC-IV ICU stays and early postoperative recurrence prediction in 713 CUMC patients with pancreatic ductal adenocarcinoma (PDAC). Models using static burden features alone (demographics, comorbidities, raw lab measurements) were compared with models that additionally used STDP transition feature sets, using patient-level cross-validation and rolling prediction horizons. In AKI, a calibrated STDP ensemble model showed higher discrimination than static burden alone at the 24-hour decision snapshot for AKI by 72 hours, with AUROC 0.838 versus 0.800, and at 48 hours for near-term AKI prediction, with AUROC 0.868 versus 0.827. In PDAC, STDP transition features modestly improved Day -30 preoperative recurrence prediction, with AUROC 0.611 versus 0.587 and AUPRC 0.323 versus 0.318 for static burden, and showed similar performance at Day 0 (7 days before recorded surgery date), with AUROC 0.681 and AUPRC 0.363. Decision-curve and feature analyses suggested that selected temporal transitions were clinically interpretable across renal, inflammatory, hepatobiliary, hematologic, glycemic, and nutritional trajectories. These findings suggest that STDP-inspired transition features may provide a practical, interpretable way to incorporate temporal ordering into EHR-based risk prediction across both acute and longitudinal settings.
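
The idea of directional transition features can be sketched as follows. This is only an illustration of "event A precedes event B within a time window" encoded as a sparse binary feature; the preprint's actual encoding, window boundaries, and event vocabulary are not given in the abstract, so the function name, event codes, and windows below are all hypothetical.

```python
from itertools import product

def transition_features(events, windows):
    """Binary indicators for ordered event pairs (a -> b) per time window.

    events:  list of (time_in_hours, event_code) tuples for one patient.
    windows: list of (low, high) bounds in hours; a pair is assigned to a
             window when low < t_b - t_a <= high.
    Returns a sparse dict keyed by (code_a, code_b, window_index).
    """
    features = {}
    for (t_a, a), (t_b, b) in product(events, repeat=2):
        lag = t_b - t_a
        if lag <= 0:
            continue  # keep only pairs where a strictly precedes b
        for idx, (low, high) in enumerate(windows):
            if low < lag <= high:
                features[(a, b, idx)] = 1
    return features

# Hypothetical ICU timeline and two hypothetical windows (0-12 h, 12-48 h).
events = [
    (0, "creatinine_high"),
    (6, "diuretic"),
    (8, "lactate_high"),
    (30, "creatinine_high"),
]
windows = [(0, 12), (12, 48)]

feats = transition_features(events, windows)
print(sorted(feats))
```

Because the key records which event came first, ("creatinine_high", "diuretic", 0) and ("diuretic", "creatinine_high", 1) are distinct features, which is the directionality the abstract describes. Each key remains human-readable, so a downstream model's weights can be inspected per transition.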

15
ECG-derived age deviation predicts cardiovascular diseases across lead configurations and cohorts

Aydogdu, D.; Gaber, F.; Sorooshmehr, A.; Akalin, A.

2026-06-08 cardiovascular medicine 10.64898/2026.06.05.26354974 medRxiv
Top 21%
0.8%

Cardiovascular diseases (CVDs) remain the primary global health burden, motivating the search for robust, non-invasive risk biomarkers. We harness a foundation model pretrained on over 10 million recordings to evaluate ECG-derived age deviation as a cross-cohort biomarker of CVD burden. A predictive model, trained exclusively on healthy subjects, achieved accurate age prediction. Diseased subjects exhibited significant positive age acceleration across multiple categories, with structural and ischemic heart diseases showing the largest effects. External validation in a hospital-based cohort (n=160,493) confirmed that age acceleration independently predicts all-cause mortality, with the strongest prognostic value in patients under 65 years. Furthermore, we demonstrated that disease discrimination and mortality prediction are preserved across 6-lead and single-lead configurations, supporting potential deployment in wearable or mobile devices. Our analysis also revealed a striking morphological confound from the complete left bundle branch block, leading us to propose absolute age deviation as a more robust, universal risk marker. These findings establish ECG-derived biological age deviation as a highly generalizable and clinically actionable biomarker for assessing cardiovascular risk. We have also developed a web application at https://bioinformatics.mdc-berlin.de/ECGage that allows users to easily test our framework.

16
Early assessment of potential airline-mediated importation risk during the 2026 DRC-Uganda Bundibugyo virus disease outbreak

Kinoshita, R.; Suzuki, M.; Yoneoka, D.

2026-06-09 public and global health 10.64898/2026.06.01.26354569 medRxiv
Top 21%
0.8%

During the 2026 Bundibugyo virus disease outbreak in the Democratic Republic of the Congo and Uganda, we projected potential airline-mediated importation risk using a contemporary airline network and an externally calibrated Ebola importation hazard. Effective-distance analyses identified major international hub countries, including Belgium, France, South Africa, Kenya, and the United Arab Emirates, as higher-probability gateways within 30 days. These early projections provide a reproducible framework for real-time international situational awareness, while emphasizing that importation risk does not imply local transmission risk.

17
Exploratory dried blood spot metabolomics identifies pathway-level convergence with ME/CFS biology in a self-reported PEM-like fatigue phenotype

Hauguel, P.; Anctil, N.; Noel, L.-P.

2026-06-10 rheumatology 10.64898/2026.06.08.26355197 medRxiv
Top 21%
0.8%

Background. Plasma and serum metabolomic studies of myalgic encephalomyelitis / chronic fatigue syndrome (ME/CFS) have repeatedly implicated hypometabolic, lipid, mitochondrial, redox and tryptophan-kynurenine pathways, but prior cohorts have been modest in size and have used heterogeneous case definitions. Whether similar pathway-level signals are detectable at scale in dried blood spots (DBS), across questionnaire-derived fatigue constructs and across orthogonal LC gradients in the same individuals remains unresolved. Methods. We profiled DBS extracts from 1,784 community-cohort adults by reverse-phase LC-MS using paired 5 min and 15 min gradients. Six questionnaire-derived endpoints captured a pragmatic self-reported PEM-like phenotype, a DSQ-derived PEM-like construct, high or review clinical status, temporal fatigue state, comorbid fatigue and self-reported chronic fatigue. The locked primary endpoint for Phase 1 was pragmatic_fatigue_pem with 226 cases and 914 controls after excluding major metabolic comorbidity. We tested a biology-first panel comprising 22 literature-curated metabolites represented by four participant-level descriptors each, and evaluated three discovery extensions: a targeted m/z search of additional literature candidates, a hypothesis-free univariate screen across 4,553 5 min and 5,625 15 min consensus features, and pairwise z-difference ratios. Endpoint-specific Ridge classifiers were evaluated by five-fold out-of-fold AUC with bootstrap stability filtering. Cross-gradient agreement was assessed by per-metabolite AUC concordance between paired 5 min and 15 min profiles. Severity was modelled as an ordinal grade derived from the number of fatigue criteria met and chronic-fatigue-form status. Results. The biology-first DBS panel achieved out-of-fold AUC 0.81 for the pragmatic self-reported PEM-like endpoint (226 cases / 914 controls). 
The DSQ-derived PEM-like construct reached AUC 0.60 (57 cases / 201 controls) on the un-filtered set and AUC 0.778 (SD 0.013, twenty seeds) in a post-hoc signature-decomposition follow-up restricted to participants without a self-declared major-metabolic-history tag (29 cases / 230 controls); both are treated as construct-validity anchors rather than as provoked or clinically adjudicated PEM. An optimised operationalisation of the same construct (panel-self normalisation, restriction to non-comorbid participants and demographic covariates) reached AUC 0.71 (95 % CI 0.55 to 0.76), and an exploratory age-stratified signature decomposition suggested age-dependent pathway composition that requires confirmation given small per-stratum case counts. Stable contributors mapped to carnitine-shuttle, TCA-cycle, redox-thiol and tryptophan-kynurenine pathways. Cross-gradient analysis of 22 matched metabolites yielded Pearson r = 0.62 for signed univariate effects (p = 0.002; 68 % directional agreement). The metabolomic score increased with severity grade (Spearman rho = 0.45, p = 4 x 10^-91; median scores 0.24, 0.51 and 0.75 across grades 0, 1 and 2). Sensitivity analyses on the covariate-complete subset (n = 565; 138 cases / 427 controls) showed that the DBS signal was robust to adjustment for age, sex, BMI and medication burden (DBS-only AUC 0.76, DBS plus covariates 0.78, covariates only 0.64), and produced a metabolomic-specific lift of approximately 0.13 AUC over the strongest anti-leak declarative cross-form questionnaire baseline (AUC 0.63). DBS-only AUC was stable across sex, age and BMI subgroups, and a 1:4 nearest-neighbour matched analysis on age, sex and BMI yielded AUC 0.72 (95 % CI 0.67 to 0.77). The observed pattern supported pathway-level convergence with prior ME/CFS metabolomics literature, including carnitine shuttle, fatty-acid beta-oxidation, TCA cycle, redox-thiol, urea cycle, glycerophospholipid and tryptophan-kynurenine axes. 
In contrast, the hypothesis-free 15 min screen produced high-AUC features that mapped predominantly to environmental or technical signals, including pesticide, industrial-amine and mobile-phase artifact annotations; only one of eight top leads, a truncated oxidised phospholipid, was biologically plausible, and none had tandem-MS support. Conclusions. In this large community cohort, a literature-curated DBS metabolomic panel captured pathway-level biology associated with a questionnaire-derived PEM-like fatigue phenotype, showed directional concordance across LC gradients, scaled with symptom severity and remained robust to key demographic, anthropometric and anti-leak questionnaire baselines. The findings converge with several metabolic axes previously reported in ME/CFS plasma and serum studies, including carnitine-shuttle, TCA-cycle, redox-thiol, urea-cycle, glycerophospholipid and tryptophan-kynurenine pathways. They should not be interpreted as clinical validation of a diagnostic test, screening tool or objective provoked-PEM biomarker. Rather, they support at-home-compatible DBS metabolomics as a biologically grounded platform for future clinically adjudicated validation, decision-support development and longitudinal monitoring in fatigue and PEM-like syndromes. Because DBS contains cellular and plasma-derived components, matrix effects must be considered when comparing individual metabolites with venous plasma or serum studies, and hypothesis-free screening at this scale can preferentially surface exposome or technical variance unless molecular identification is enforced before biological interpretation.

18
Beyond event-rate enrichment: proteomic risk scores for mechanism-aware prevention trial design

Fieggen, J.; Simond, G.; Segal, B. M.; Noori, A.; Thakurta, A.; Butler, C. C.; Clifton, D. A.; Clifton, L.

2026-06-10 health informatics 10.64898/2026.06.09.26355266 medRxiv
Top 22%
0.8%

Background. Blood-based biomarkers are increasingly proposed for identifying high-risk individuals before clinical disease and for making prevention-oriented trials more efficient. Prognostic enrichment can increase event rates, but trial efficiency also depends on whether the intervention effect is preserved in the enriched population. Methods. Using the UK Biobank Pharma Proteomics Project, we trained disease-specific proteomic risk scores (ProRS) from 2,916 plasma proteins with elastic-net Cox models. We compared ProRS, polygenic risk scores (PRS), and combined PRS-ProRS scores across ten incident diseases. We estimated cumulative incidence and theoretical two-arm time-to-event trial sample sizes across risk strata. To evaluate effect preservation, we examined six intervention-analogue exposure-outcome pairs spanning genetic (PCSK9/coronary artery disease, APOE/Alzheimer's disease, PPARG/type 2 diabetes, IL23R/Crohn's disease), behavioural (physical activity/all-cause mortality), and pharmacological (RAAS inhibitors versus calcium channel blockers/coronary artery disease) examples. Results. ProRS outperformed PRS for 9 of 10 diseases (median C-index 0.75 versus 0.61). ProRS and PRS were weakly correlated (median Pearson |r| = 0.04), and joint PRS-ProRS stratification identified groups with higher observed incidence than either score alone for several endpoints. In the top risk quartile, combined-score enrichment reduced theoretical required sample sizes by 32-74% under a fixed 20% relative hazard reduction. These gains were not always preserved when stratum-specific intervention-analogue effects were used. Effects were broadly preserved for APOE/Alzheimer's disease and physical activity/mortality. The PPARG/type 2 diabetes effect attenuated toward the null under all three score types, showing that event-rate enrichment does not guarantee effect preservation.
For IL23R/Crohn's disease and the antihypertensive comparison, point estimates differed across score types (preserved under polygenic but attenuated under proteomic enrichment), but confidence intervals were wide and overlapping. Conclusions. Proteomic risk scores can identify high-event-rate populations for prevention-oriented trials, but event-rate enrichment alone is insufficient for trial design. Biomarker-guided enrichment should evaluate mechanism-specific effect preservation and may be preferable as a stratification or adaptive-design variable rather than as a restrictive eligibility criterion.
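
To make the event-rate argument concrete, the sketch below uses the standard Schoenfeld approximation for a 1:1 two-arm log-rank comparison. The abstract does not say which sample-size formula the authors used, so this is an assumption; the function names and the event probabilities in the usage lines are hypothetical.

```python
import math
from statistics import NormalDist

def required_events(hazard_ratio, alpha=0.05, power=0.80):
    """Schoenfeld approximation: events needed for a 1:1 two-arm log-rank test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(4 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2)

def required_sample_size(hazard_ratio, event_probability, alpha=0.05, power=0.80):
    """Total participants, assuming event_probability is the average chance
    that an enrolled participant has the event during follow-up."""
    events = required_events(hazard_ratio, alpha, power)
    return math.ceil(events / event_probability)

# Fixed 20% relative hazard reduction (HR = 0.8); event probabilities are illustrative.
n_unselected = required_sample_size(0.8, event_probability=0.05)
n_enriched = required_sample_size(0.8, event_probability=0.10)
```

With the hazard ratio held fixed, the number of required events does not change, so doubling the event probability roughly halves the enrolment target. If the hazard ratio in the enriched stratum moves toward 1, the required events grow with 1 / (ln HR)^2 and can cancel that gain, which is the effect-preservation concern the abstract raises.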

19
Context-Dependent Age-Group performance hierarchies limit fairness interventions in PPG-based heart rate prediction

Panchumarthi, L. Y.; Kataria, S.; Wu, Y.; Hu, X.; Fedorov, A.; Kwak, H. G.

2026-06-05 health informatics 10.64898/2026.06.04.26352929 medRxiv
Top 23%
0.8%

Background. Fairness-aware machine learning increasingly targets demographic performance disparities in clinical prediction, yet whether standard bias mitigation strategies genuinely improve equity in physiological signal analysis remains unclear. Age-based disparities in photoplethysmography (PPG)-based heart rate prediction present a particular challenge, as age-related performance differences may reflect context-dependent physiological structure rather than correctable artifacts. Methods. We evaluated three fairness interventions, inverse-frequency weighting (IF), Group Distributionally Robust Optimization (GroupDRO), and adversarial debiasing (ADV), applied via fine-tuning of a PPG foundation model across three clinical datasets spanning intensive care unit, laboratory, and consumer wearable contexts. Outcomes were assessed using a 2x2 framework classifying each intervention-dataset combination by the joint direction of change in mean absolute error (MAE) and fairness gap (FG) across age groups, yielding four outcome types: genuine improvement (G), leveling down (L), selective benefit (S), and both worse (W). Results. Across nine intra-domain conditions, no intervention simultaneously improved both MAE and FG (0/9 genuine improvement). The dominant pattern was leveling down (5/9): FG decreased but was accompanied by MAE degradation, indicating that apparent fairness gains were achieved at the cost of overall predictive performance. Age-group difficulty ordering varied across clinical contexts at baseline and was not preserved under intervention. In 18 cross-domain transfer conditions, genuine improvement was rare (4/18) and observed exclusively in non-MIMIC source configurations; models fine-tuned on MIMIC-sourced data yielded no genuine improvements (0/6). Embedding-level representation changes following fine-tuning did not reliably predict fairness outcomes. Conclusions. 
Age-based fairness interventions in PPG heart rate prediction mostly produced leveling down rather than genuine equity improvement, suggesting that age-related performance gaps reflect context-dependent physiological structure not fully addressable through standard bias mitigation. Cross-domain transfer further amplifies this instability. These findings suggest that fairness evaluation frameworks for age-stratified physiological prediction should account for context-dependent performance structure rather than treating observed gaps as correctable bias.
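
The 2x2 outcome framework in this abstract can be sketched as a small classifier. The abstract defines leveling down explicitly (fairness gap falls while MAE degrades); reading selective benefit as "MAE improves while the fairness gap widens" and treating an unchanged metric as not improved are assumptions made here, and the function name is hypothetical.

```python
def classify_outcome(mae_base, mae_new, fg_base, fg_new):
    """Label an intervention by the joint direction of change in MAE and fairness gap.

    Lower is better for both metrics. Ties count as "not improved" (an assumption).
    """
    mae_better = mae_new < mae_base
    fg_better = fg_new < fg_base
    if mae_better and fg_better:
        return "G"  # genuine improvement: error and gap both shrink
    if fg_better:
        return "L"  # leveling down: gap shrinks but overall error grows
    if mae_better:
        return "S"  # selective benefit (assumed): error shrinks but gap widens
    return "W"      # both worse

# Illustrative numbers only: the gap narrows from 3.0 to 2.0 while MAE rises from 5.0 to 5.6.
label = classify_outcome(mae_base=5.0, mae_new=5.6, fg_base=3.0, fg_new=2.0)
```

Reporting the label rather than the fairness gap alone keeps a leveling-down result from being counted as an equity gain.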

20
Estimating COVID-19 Cumulative Incidence from Seroprevalence Surveys accounting for Time-Varying Seroreversion: A Fully Bayesian Methodology

Owusu-Boaitey, N.; Meyer, M. J.; Herrera-Esposito, D.; Bottcher, L.; Lukz, M.; Cook, S.; Stoto, M. A.; Kraemer, J. D.

2026-06-10 epidemiology 10.64898/2026.06.09.26355264 medRxiv
Top 24%
0.7%

Seroprevalence surveys reveal the extent of humoral immunity against pathogens such as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), and under some circumstances represent cumulative incidence of prior infection. However, antibody waning - or seroreversion - biases these estimates by reducing assay sensitivity in a time-varying manner. Because assay sensitivity decays over time, naively using serosurveys can substantially bias estimates of SARS-CoV-2 cumulative incidence and fatality rates. The Bayesian assay-specific, time-varying sensitivity adjustment developed in this paper can reliably correct for this bias and account for the delay between infection and serosurvey. In seroprevalence studies conducted in the United States in 2020, adjusting for time-varying sensitivity increased cumulative incidence by up to 1.4-fold, with an adjustment of 1.08 for a national study. Our estimates contrast with a previously published 2-fold adjustment that did not account for assay design. This suggests that previous analyses overestimated cumulative incidence by applying seroreversion corrections that did not account for assay-specific effects, or underestimated cumulative incidence by not applying seroreversion corrections. These biases imply fatality rate underestimation and overestimation, respectively. Our model provides a framework for design-specific time-varying sensitivity corrections in seroprevalence surveys for other pathogens.
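
Why waning sensitivity biases a serosurvey can be shown with the classical Rogan-Gladen point correction. This is a deliberately simpler technique than the fully Bayesian, assay-specific model the abstract describes, and the exponential decay curve, half-life, and survey numbers below are hypothetical.

```python
def waning_sensitivity(initial_sensitivity, days_since_infection, half_life_days):
    """Hypothetical exponential decay of assay sensitivity after infection."""
    return initial_sensitivity * 0.5 ** (days_since_infection / half_life_days)

def rogan_gladen(observed_prevalence, sensitivity, specificity):
    """Classical test-error correction; requires sensitivity + specificity > 1."""
    estimate = (observed_prevalence + specificity - 1) / (sensitivity + specificity - 1)
    return min(max(estimate, 0.0), 1.0)  # clamp to a valid proportion

# Illustrative survey: 5% seropositive, specificity 0.99, infections about 120 days old.
sens_now = waning_sensitivity(0.95, days_since_infection=120, half_life_days=400)
naive = rogan_gladen(0.05, sensitivity=0.95, specificity=0.99)
adjusted = rogan_gladen(0.05, sensitivity=sens_now, specificity=0.99)
```

Using the sensitivity measured soon after infection (`naive`) yields a lower estimate than using the decayed value (`adjusted`), so ignoring seroreversion understates cumulative incidence. A single point correction like this ignores the spread of infection times and gives no uncertainty interval, both of which a Bayesian treatment can represent.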